Detection of a watermark in a digital signal

ABSTRACT

Watermark detectors have a buffer in which a number of image tiles are folded and accumulated prior to computing the correlation between buffer contents and the watermark pattern being looked for. The intention of the folding and accumulation process is to average out the video content while accumulating the embedded watermark energy. This no longer appears to hold for strongly compressed video, such as DIVX, which exhibits a lot of artificial noise and undesired similarity (block patterns). As a result thereof, correlation peaks are often below the threshold. In a similar manner, the compression affects scale detection. According to this invention, only frames (or parts thereof) that are not so heavily compressed and therefore have a high probability of carrying enough watermark energy are folded and accumulated. To this end, a quality metric is calculated, the quality metric being indicative of the degree of compression of the data. The quality metric may be calculated based on the compressed data itself or derived from the decompressed base-band data. An advantageous example is the number of non-zero DCT coefficients of a (residue) frame. A determination is then made as to whether to exclude the frame (or part thereof) from the watermark decode process. The quality metric may also be used to select data for use in a scale detection process.

The present invention relates to the detection of watermark signals embedded in digital data, the data typically representing multimedia content. A typical format for such data is MPEG2, although the invention may be used with other formats also.

In order to embed certain information, such as copyright, copy control, source or authentication data into a digital signal, a technique known as watermarking is often used. This involves processing the digital data so that a recognizable pattern is ‘overlaid’ onto the data to be watermarked. Different types of watermark have different uses. A simple robust watermark, which is intended to survive a wide range of processing steps in the analogue and digital domains, may simply indicate that the watermarked data is subject to copyright, and may provide further details, such as owner and date. A fragile watermark is often added in such a way that it is corrupted or broken if the data is processed in any way. In this way, the absence of a fragile watermark in a data file, or stream, in which one was expected, can indicate that the data has been processed or otherwise tampered with. This can be useful in medical or forensic science applications where authenticity is crucial.

The various types of watermark pattern themselves consist of a pseudo-noise signal which is overlaid onto, or woven into, the data itself. The watermark signal should ideally not degrade the source data in a perceptible manner, but should be detectable by a suitable decoder.

A particular problem arises when the watermarked data is compressed to a very low bit-rate, suitable for transmission over the Internet, or other data transfer system. DIVX is one system which produces very low bit-rates, and is widely used to reduce the amount of bandwidth required to transmit video images over the Internet.

Currently used watermarking systems such as JAWS (Ton Kalker, Geert Depovere, Jaap Haitsma, Maurice Maes, “A Video Watermarking System for Broadcast Monitoring”, Proceedings of SPIE Electronic Imaging '99, Security and watermarking of Multimedia Contents, San Jose (Calif.), USA, Jan. 1999) use detectors which search for embedded watermarks by collecting large amounts of video data, which is then folded and accumulated before the accumulated data is correlated with the expected watermark pattern. With video data that has been compressed to a very low bit-rate, e.g. using DIVX, a frequently encountered result is correlation peaks which occur below the detection threshold. This means that detection of the embedded watermark(s) may fail, which can cause inconvenience for users of the system who may be authorized to view the watermarked video, but are prevented from doing so in the absence of a proper detection of the watermark(s).

A further problem occurs when the watermarked video has been scaled or re-sized. In order to detect an embedded watermark, the original scale of the video signal is required, so that the accumulation buffer, which captures incoming video data, can be correspondingly scaled to the original video dimensions. The original scale must be determined from the scaled video data itself. Compared to the watermark detection process, where the video data is correlated against known watermark data, prior art scale-detection processes operate by correlating two noisy accumulation buffers with each other to yield the scale factor.

In the JAWS system, the watermark detection process and the scale retrieval process both make use of a repetitive watermark pattern embedded in the source data. During the watermark embedding process, a 128×128 watermark pattern is ‘tiled’ over the full extent of a frame of data.

In order to retrieve the horizontal scale information from a scaled version of the data, the process begins by arbitrarily selecting two horizontally adjacent tiles A and B from a number of accumulated frames. The two tiles are then correlated with each other according to the following steps:

-   Calculate 128×128 Hanning window over A and B; Han(A), Han(B)
    -   A Hanning window is a kind of filter which acts to ‘fade out’ the edges of the tile to which it is applied. In this way, the data in the centre of the tile is preserved, but closer to the edges, the data fades to zero. This alleviates the effect of edges introducing strong artificial frequency components in the ensuing FFT calculation.
-   Calculate 128×128 Fast Fourier Transform (FFT) over A and B
-   Calculate complex conjugate of Han(B); Con(Han(B))
-   Calculate pointwise multiplication of Han(A) and Con(Han(B))
-   Normalise multiplication result. This is done according to the following formula for each complex value (z) in the result, so that z is replaced by

$\frac{z}{\sqrt{\mathrm{re}(z)^{2} + \mathrm{im}(z)^{2}}}$

-   Calculate Inverse FFT of previous step

The position of the highest value in the first row of the IFFT result is then used to calculate the horizontal scale factor. If the first value is the highest, then the horizontal scaling factor is 1, i.e. no scaling has occurred.

The vertical scaling factor is calculated in a similar way, but two vertically adjacent tiles and the first column of the IFFT result are used instead.
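By way of illustration only, the normalised correlation described above can be sketched on a short one-dimensional signal. This is a toy under stated assumptions, not the JAWS implementation: the names `dft`, `normalised_correlation` and `peak_index` are hypothetical, a naive transform stands in for the 128×128 FFT, the Hanning window is omitted so that the test signal is an exact cyclic shift, and the conjugate and product are assumed to be taken on the transformed tiles. The mapping from peak position to scale factor is not reproduced here.

```python
import cmath
import random


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a toy signal)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(total / n if inverse else total)
    return out


def normalised_correlation(a, b):
    """Transform, multiply by the conjugate, normalise each value, invert."""
    fa, fb = dft(a), dft(b)
    product = [x * y.conjugate() for x, y in zip(fa, fb)]
    # z -> z / sqrt(re(z)^2 + im(z)^2); near-zero values are left at zero.
    unit = [z / abs(z) if abs(z) > 1e-12 else 0j for z in product]
    return [z.real for z in dft(unit, inverse=True)]


def peak_index(row):
    """Position of the highest value in the correlation result."""
    return max(range(len(row)), key=lambda i: row[i])


# Hypothetical test data: a random signal and a cyclically shifted copy.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(16)]
shift = 3
b = a[-shift:] + a[:-shift]  # b[n] = a[n - shift], cyclically

print(peak_index(normalised_correlation(a, a)))
print(peak_index(normalised_correlation(a, b)))
```

Correlating the signal with itself places the peak at the first position, matching the unscaled case described above. For the shifted copy the peak moves away from the first position (to index 13 for a shift of 3 in 16 samples under this sign convention).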

The correlation peaks for this scale retrieval process are even lower than for the watermark detection process due to the inherently more noisy buffer samples used. (Watermark detection involves a correlation between a known pattern and a noisy accumulation buffer; scale detection is a correlation between two noisy accumulation buffers.) To further complicate matters, frame folding may not be used in the scale detection process. This is because frame folding can only be used if the scale is known. If the scale is not known, patterns are accumulated that are not synchronised and the resulting accumulation buffer is useless. As a result, only accumulation can be used. This means that more frames must be collected before correlation can be performed, which, of course, takes more time.

Folding works by ‘magnifying’ the watermark data, as it always has the same sign. The underlying video signal is effectively ‘random’ and so is averaged out. Folding for long enough results in the original watermark pattern. However, if the patterns (tiles of 128×128) are not exactly aligned, the process does not work.
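The folding and accumulation steps can be sketched as follows. This is a minimal illustration under assumptions: the function names `fold`, `accumulate` and `correlate` are hypothetical, frames are plain nested lists, and a 2×2 pattern stands in for the 128×128 tile. The example frame carries only the tiled pattern, so it shows the ‘magnifying’ effect but not the averaging out of video content.

```python
def fold(frame, tile):
    """Sum every tile x tile block of a frame into one tile-sized buffer."""
    rows, cols = len(frame), len(frame[0])
    buf = [[0.0] * tile for _ in range(tile)]
    # Only whole tiles are folded; a partial border is ignored.
    for y in range(rows - rows % tile):
        for x in range(cols - cols % tile):
            buf[y % tile][x % tile] += frame[y][x]
    return buf


def accumulate(buffer, folded):
    """Add one folded frame into a running accumulation buffer, in place."""
    for y, row in enumerate(folded):
        for x, value in enumerate(row):
            buffer[y][x] += value


def correlate(buffer, pattern):
    """Inner product of the buffer with the expected watermark pattern."""
    return sum(b * p
               for buf_row, pat_row in zip(buffer, pattern)
               for b, p in zip(buf_row, pat_row))


# Hypothetical 2x2 pattern tiled over a 4x6 frame (2 x 3 = 6 tiles).
pattern = [[1, -1], [-1, 1]]
frame = [[pattern[y % 2][x % 2] for x in range(6)] for y in range(4)]

buffer = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(3):                      # three identical frames
    accumulate(buffer, fold(frame, 2))
print(buffer, correlate(buffer, pattern))
```

Because every tile contributes the pattern with the same sign, each folded frame adds six copies of the pattern to the buffer. If the tile size passed to `fold` did not match the embedded tiling, the contributions would no longer line up, which is the alignment problem noted above.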

Prior art techniques attempt to alleviate these problems by accumulating more frames per detection in the hope that the video data averages out and the watermark signal amplifies, so that the signal (watermark) to noise (video) ratio increases.

In a typical scale-detection, up to 300 frames are currently used. However, in the case of DIVX compressed video, a lot of artificial noise and undesired similarity, caused by block patterns, is introduced. During the accumulation process, more noise than watermark energy is generally accumulated. Also, the undesired patterns are amplified as well, and are usually stronger than the watermark signal. All these problems make reliable scale-detection of DIVX video difficult, and often impossible. Without reliable scale detection, watermark detection is not possible.

An object of embodiments of the present invention is to at least alleviate the above mentioned problems experienced with prior art detection systems, and provide a better watermark detection system for use with highly compressed video or other multimedia data.

A further object of embodiments of the present invention is to allow the performance of a more reliable scale detection process before watermark detection is carried out.

According to the present invention, there is provided a method of selecting data for use in decoding an embedded watermark in compressed multimedia data, comprising the steps of:

-   calculating a quality metric for a given part of the compressed multimedia data based on the degree of compression of the multimedia data;
-   including in a watermark decoding process, the given part, if its quality metric is higher than a certain threshold, and;
-   excluding from the watermark decoding process, the given part, if its quality metric is lower than the threshold.

Preferably the method further includes the step of using the same quality metric to select data to use in a scale-detection process performed before the watermark decoding process. In cases where no scaling has taken place, this will return a scale factor of 1. Otherwise, the scale-detection process will return a value which allows accumulation buffers to be sized appropriately before a watermark is decoded.

Preferably, the quality metric is calculated on the basis of an analysis of a compressed data stream. Such a compressed data stream is provided by DIVX systems.

Suitably, in cases where access to the compressed data stream is possible, the quality metric may be determined on the basis of one of: Quantisation factors; the number of Variable Length Codewords (VLCs) used to code a data frame; Motion Vectors.

The quality metric may also be calculated on the basis of a plurality of parameters associated with the compressed data stream.

Preferably, the quality metric may be calculated on the basis of an analysis of base-band data.

Preferably the quality metric is calculated on the basis of a measure of the energy of a frame.

The quality metric may also be calculated on the basis of a plurality of parameters associated with the base-band data.

Preferably, the given part of the data is a frame. Alternatively, part-frames may also be used.

Preferably, apparatus is provided to perform the method according to the invention.

For a better understanding of the present invention, and to understand how the same may be brought into effect, the invention will be described, by way of example only, with reference to the appended drawings in which:

FIG. 1 shows a schematic representation of an embodiment of the present invention.

FIG. 1 shows a schematic representation of the data flow in an embodiment of the invention. A data buffer 10 is arranged to receive an incoming data stream 110. The data stream 110 is, in a particular embodiment, a DIVX coded video data stream. Data buffer 10 operates to select all or part 120 of a frame of the incoming data stream, which is then analysed in quality metric calculator 20. Quality metric calculator 20 operates on the data frame (or part thereof) 120 to establish a quality metric 130 of the input data frame 120. The quality metric is indicative of the likelihood of the particular frame including sufficient watermark energy to be used in the watermark decoding process. Methods of calculating the quality metric will be presented shortly.

The quality metric 130 is compared with a pre-defined level in threshold detector 30. If the quality metric indicates a high probability of the frame 120 including a suitable quantity of watermark energy, then the frame 120 is made available to the watermark detection process 40.

If, however, quality metric 130 falls below the pre-defined acceptable level, the threshold detector discards 50 the data in frame 120 and it will play no part in the watermark decoding process 40.

In this way, only data which has a higher probability of including sufficient watermark energy to enable a successful decode of the watermark to be performed is passed to the watermark decoding process. The output of the watermark decode process is watermark 140. Alternatively, the output 140 could be a binary signal indicating either a correct decode or that no watermark was detected.

In order to determine a quality metric (Q), one or more characteristics of the data are assessed or measured. The following examples highlight attributes which may be used in some situations. The skilled man will be aware of other attributes which may form the basis of a quality metric calculation in other situations.

The quality metric (Q) effectively provides a measure of how much the subject data has been compressed. The more compressed the data, the harder it is to extract the watermark from it.

If access to the compressed data stream is possible, there are several parameters available from the stream itself which may be used in order to determine a quality metric (Q). Some suitable parameters are:

-   Quantisation Factors
-   The number of Variable Length Codewords (VLCs) used to code a frame
-   Motion Vectors

In a system where access to the compressed data stream is possible, a quality metric may be derived by counting the number of VLCs used to code a frame. In this case, only frames coded with more than 5000 coefficients are folded and used in the watermark detection process.
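The selection step can be sketched as a simple gate. This is an illustrative sketch only: the function `select_frames`, the dictionary layout and the `vlc_count` field are hypothetical, and the counts are made-up values. A frame whose metric exactly equals the threshold is excluded here, consistent with the ‘more than 5000’ wording above.

```python
def select_frames(frames, quality, threshold):
    """Split frames into those passed to watermark detection and those not.

    `quality` is any callable returning the quality metric Q for a frame.
    """
    included, excluded = [], []
    for frame in frames:
        if quality(frame) > threshold:
            included.append(frame)
        else:
            excluded.append(frame)  # skipped for detection only
    return included, excluded


# Hypothetical per-frame statistics read from a compressed stream.
frames = [
    {"id": 0, "vlc_count": 7200},
    {"id": 1, "vlc_count": 1800},
    {"id": 2, "vlc_count": 5000},
]

kept, dropped = select_frames(frames, lambda f: f["vlc_count"], 5000)
print([f["id"] for f in kept], [f["id"] for f in dropped])
```

Passing the metric as a callable keeps the gate independent of how Q is obtained, so the same function could be reused with a metric derived from base-band data.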

In many instances, however, access to the original compressed stream is not possible and only access to the base-band video signals is possible, for example. In such instances, access to the previously mentioned parameters is not possible and so different measures may be used to determine Q. One such measure is:

-   A measure of Energy. Such a measure can be obtained, for example, by 8×8 DCT transforming blocks of a frame, quantising the coefficients with a coarse standard MPEG Quantisation matrix, and counting the number of non-zero coefficients. The non-zero coefficients of a block are indicative of its energy content. If there are many high coefficients around DC frequency, this indicates that there are sharp edges in the block. A lot of non-zero coefficients means that the block has a complex structure. If there are no AC coefficients, this means that the block is flat. In general, the more non-zero coefficients there are, the more watermark energy there is likely to be available in the block.
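A rough sketch of such an energy measure follows. It is an assumption-laden illustration: the names `dct_8x8` and `count_nonzero` are hypothetical, the transform is a direct (slow) orthonormal DCT-II, and a single flat quantiser step of 16 stands in for the coarse MPEG quantisation matrix, whose actual entries are not given here.

```python
import math

N = 8  # block size


def dct_8x8(block):
    """Direct orthonormal 2-D DCT-II of an 8x8 block (nested lists)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[v][u] = alpha(u) * alpha(v) * total
    return coeffs


def count_nonzero(block, step=16):
    """Quantise with one flat step (an assumption) and count non-zero values."""
    return sum(1
               for row in dct_8x8(block)
               for c in row
               if round(c / step) != 0)


# Hypothetical test blocks: a flat block and one with a sharp vertical edge.
flat = [[128] * N for _ in range(N)]
edge = [[0] * 4 + [255] * 4 for _ in range(N)]

print(count_nonzero(flat), count_nonzero(edge))
```

The flat block leaves only the DC coefficient after quantisation, while the block containing an edge yields additional non-zero AC coefficients. Summing the counts over all blocks of a frame would give one possible per-frame value of Q.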

Once a suitable quality metric (Q) has been calculated from one or more given attributes of the signal, it is possible to establish a threshold for a particular value of Q, such that data frames (or parts thereof) having a value of Q which falls below the threshold, can be discarded for the purpose of decoding an embedded watermark. The actual data frame (or part thereof) is of course retained so that its inherent data content (e.g. video) can be decoded.

The establishment of a threshold depends on the particular attribute of the data signal which was chosen as the basis of the quality metric, and may best be determined in a particular case by experimentation.

As stated previously, a further problem arises when the compressed video signal has been scaled. Before the watermark can be decoded from the compressed signal, the original scale of the signal has to be recovered.

Embodiments of the present invention operate to recover scale information in a similar way to that just described to recover watermark information. To recover scale information, two accumulation buffers are correlated, with the resultant correlation giving a direct indication of the scale factor.

In order to improve the results of the correlation process, the same quality metric (Q) calculated above can be used to identify candidate frames (or parts thereof) which are less heavily compressed, and thus have a higher Q. These candidate frames can be used for the scale-determining correlation process in preference to frames (or parts thereof) which are more heavily compressed, and thus have a lower Q.

Experiments have shown that the scale detection process is greatly improved by being selective about which data samples are used in the correlation process. In cases where the correlation peaks would otherwise be below a defined detection threshold using prior art methods, making scale detection impossible, it is found that embodiments of the invention are able to determine scale factors by selectively discarding certain data samples which do not contribute to a successful correlation.

In effect, the same technique may be used firstly to discover the scale factor of the compressed signal, which can then be used to scale the accumulation buffer appropriately and, secondly, to enable a more reliable watermark decode to take place.

Embodiments of the invention may be implemented using suitably conditioned or programmed hardware. Such hardware may include specialised hardware such as a custom ASIC, or a more general processor or DSP operating according to a suitable program.

The skilled man will be aware of other parameters which may be used as the basis for calculating a quality metric, and the examples illustrated herein are not intended to limit the scope of the present invention, which is to be determined by the appended claims.

1. A computer based method of selecting data for use in decoding an embedded watermark in compressed multimedia data, comprising: calculating a quality metric for a given part of the compressed multimedia data, based on the degree of compression of the multimedia data; including in a watermark decoding process, the given part, if its quality metric is higher than a certain threshold, and; excluding from the watermark decoding process, the given part, if its quality metric is lower than the threshold, wherein the method additionally includes using the same quality metric to select data to use in a scale-detection process performed before the watermark decoding process.

2. A computer based method of selecting data for use in decoding an embedded watermark in compressed multimedia data, comprising: calculating a quality metric for a given part of the compressed multimedia data, based on the degree of compression of the multimedia data; including in a watermark decoding process, the given part, if its quality metric is higher than a certain threshold, and; excluding from the watermark decoding process, the given part, if its quality metric is lower than the threshold, wherein the quality metric is calculated on the basis of one of the following parameters associated with the compressed data stream: Quantisation factors; the number of Variable Length Codewords (VLCs) used to code a data frame; Motion Vectors.

3. The method as claimed in claim 2 wherein the quality metric is calculated on the basis of a plurality of parameters associated with the compressed data stream.

4. A computer based method of selecting data for use in decoding an embedded watermark in compressed multimedia data, comprising: calculating a quality metric for a given part of the compressed multimedia data, based on the degree of compression of the multimedia data; including in a watermark decoding process, the given part, if its quality metric is higher than a certain threshold, and; excluding from the watermark decoding process, the given part, if its quality metric is lower than the threshold, wherein the quality metric is calculated on the basis of an analysis of base-band data.

5. The method as claimed in claim 4 wherein the quality metric is calculated on the basis of a measure of the energy of a frame.

6. The method as claimed in claim 5 wherein the quality metric is calculated on the basis of a plurality of parameters associated with the base-band data.